[57] ABSTRACT

The present invention is directed towards a means to detect 
and reorder out of order instructions that may violate data 
coherency. The invention comprises a mis-queue table for
holding entries of instruction data, each entry corresponding 
to an instruction in a computer microprocessor. The instruc-
tion data in each entry comprises: i) address information for
the instruction; ii) ordering information for the instruction, 
indicating the order of the instruction relative to other 
instructions in the mis-queue table; iii) data modification 
information for the instruction, for indicating a possibility of 
modified data; and iv) out of order information, for indicat- 
ing that a newer instruction has completed before the 
corresponding older instruction to the entry. The invention 
also comprises an out of order comparator for comparing an 
address of a completed instruction to any address informa-
tion entries in the mis-queue table. If a completed instruction
accesses the same address as another instruction, as indi- 
cated in the address information in the mis-queue table, and 
the completed instruction is newer than the matched 
instruction, the out of order field is marked indicating this 
condition exists. The invention comprises a modification 
comparator. This compares addresses from data altering 
events to those addresses in the entries in the mis-queue 
table. On a match, the modification field of the correspond- 
ing entry is marked to indicate this condition exists. When 
an instruction entry indicates that the corresponding instruc- 
tion's data is modified, and that the instruction is out of 
order, all subsequent instructions are canceled. 
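
As an illustration only, the table entry and the two comparators described above might be modeled as follows. This is a minimal Python sketch, not the patented circuit: the names `MisQueueEntry`, `mark_out_of_order`, `mark_modified` and `needs_cancel`, and the use of an increasing integer as ordering information, are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MisQueueEntry:
    """One mis-queue table entry (field names are hypothetical)."""
    address: int                 # i) address information
    order_id: int                # ii) ordering information; smaller = older (assumed)
    modified: bool = False       # iii) data modification information
    out_of_order: bool = False   # iv) out of order information


def mark_out_of_order(table: List[MisQueueEntry], address: int, order_id: int) -> None:
    """Out of order comparator: run when an instruction completes."""
    for entry in table:
        # A pending entry is flagged only if it is older than the
        # completing instruction and refers to the same address.
        if entry.address == address and entry.order_id < order_id:
            entry.out_of_order = True


def mark_modified(table: List[MisQueueEntry], address: int) -> None:
    """Modification comparator: run on a data altering event such as a snoop."""
    for entry in table:
        if entry.address == address:
            entry.modified = True


def needs_cancel(entry: MisQueueEntry) -> bool:
    """Subsequent instructions are canceled only when both flags are set."""
    return entry.modified and entry.out_of_order
```

In this sketch neither flag alone triggers a cancellation; only an entry that is both out of order and modified does, which mirrors the condition stated in the last sentence of the abstract.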

7 Claims, 5 Drawing Sheets 
APPARATUS AND METHOD FOR TRACKING OUT OF ORDER LOAD INSTRUCTIONS TO AVOID DATA COHERENCY VIOLATIONS IN A PROCESSOR

FIELD OF THE INVENTION

This invention relates generally to the field of computer processors, and more particularly, to processors which are integrated on a single microprocessor chip. Still more particularly, the invention relates to detection and correction of data coherency problems inherent in out of order processing, especially in a multiple CPU system.

BACKGROUND OF THE INVENTION

Providing ever faster microprocessors is one of the major goals of current processor design. Many different techniques have been employed to improve processor performance. One technique which greatly improves processor performance is the use of cache memory. As used herein, cache memory refers to a set of memory locations which are formed on the microprocessor itself and which consequently have a much faster access time than other types of memory, such as RAM or magnetic disk, which are located separately from the microprocessor chip. By storing a copy of frequently used data in the cache, the processor is able to access the cache when it needs this data, rather than having to go off chip to obtain the information, greatly enhancing the processor's performance.

However, certain problems are associated with cache memory. In particular, a great problem exists when multiple processors are employed in a system and need the same data. In this case, the system needs to ensure that the data being requested is coherent, that is, valid for the processor at that time. Another problem exists when the data is stored in the cache of one processor, and another processor is requesting the same information.

Superscalar processors achieve performance advantages over conventional scalar processors because they allow instructions to execute out of program order. In this way, one slow executing instruction will not hold up subsequent instructions which could execute using other resources on the processor while the stalled instruction is pending.

In a typical architecture, when an instruction requires a piece of data, the processor goes first to the onboard cache to see if the data is present there. Some caches have two external ports, and the cache can be interleaved. This means that, for example in FIG. 1, a cache 100 has two cache banks, 140 and 130. One cache bank could be for odd addresses and the other cache bank would then be for even addresses.

Internally, each of cache banks 140 and 130 has an internal input port (not shown) to which address information of a cache request is presented. In FIG. 1, the data for address A1 is stored on cache line 110 in cache bank 130, and the data for address A2 is stored on cache line 120 in cache bank 140. Cache 100 has two external ports for input data, port 180 and port 190.

Cache request 1 shows a cache request for an instruction 1 (not shown), and cache request 2 shows a cache request for instruction 2 (not shown). Instruction 1 is an older instruction than instruction 2, meaning it should be executed before instruction 2. If a superscalar processor has multiple load units, such as in the PowerPC™ processor from IBM Corporation, Austin, Tex., then both instructions could make a cache request at the same time. In the example shown, both instruction 2 and instruction 1 are attempting to access data at address A1, and have submitted cache requests to cache 100 to do so.

Since bank 130 only has one internal input port, both cache requests cannot be processed at the same time. This is due to the interleaved nature of cache 100.

FIG. 2 shows what happens when cache request 2 reaches cache bank 130 before cache request 1. Cache request 2 hits in cache bank 130 for the data it needs. However, cache request 1 cannot access cache bank 130 until at least the next cycle. Thus, newer instruction 2 can get the data it needs before older instruction 1 can. Newer instruction 2 can complete before older instruction 1 in this case because of this port allocation conflict.

The same ordering problem can occur when an older instruction misses in the cache, and a newer instruction hits. A miss occurs when the address of the data cannot be found in the memory management unit, and the memory management [illegible] brought from [illegible] memory. A hit occurs when the address of the data is accessible through both the memory management unit and the cache, and this data can be output to the instruction waiting for it.

[illegible] instruction followed by a [illegible] instruction, both attempting to access [illegible] represented by two different effective addresses. When the [illegible] instruction and its [illegible] already accessible by the memory management unit and the cache, and where the older instruction address and [illegible] accessible in the memory management unit and the cache, this also leads to a situation where a newer instruction accessing the same data as an older instruction will complete before the older instruction.

In multi-processor systems, a cache miss in one processor causes a "snoop" request to the other processors in the system. This snoop request indicates to the other processors that the data being "snooped" is being requested by another processor, and the other processors should determine whether the address being sought resides in their own cache. If it does, the main memory data should be made coherent, that is, updated to reflect the correct current state of the system.

In terms of superscalar architecture, this problem is compounded by the fact that any loads may be finished out of order, or in other words, a newer instruction may be marked for completion before an older one. That is, a newer instruction may be marked as set to execute before an older one is. Thus, two load instructions may address the same cache location, and the newer instruction may actually be furnished with a piece of data before the older instruction. Thus, the newer instruction may be marked for completion out of order, possibly causing false data to be used in the completion of the instruction. When a later load instruction bypasses an earlier load instruction, the earlier load instruction may get newer data than it should have received based on the original program order.

Previous solutions to this coherency problem include the one detailed in U.S. patent application Ser. No. 08/591,249, filed Jan. 18, 1996, now U.S. Pat. No. 5,737,636, entitled A Method and System for Bypassing in a Load/Store Unit of a Superscalar Processor. In this solution, a Load Queue held a page index and a real address along with an ID and a valid bit. The ID indicated the program order of the load instruction.
In addition to the aforementioned entries, the Load Queue entry also held a modified field which indicates whether the cache line entry for the address has been modified. When a cache access, such as a store instruction or a snoop request, indicates that the cache line may have been modified, the Load Queue is searched. If it contains an entry for the same line, the modified bit is set to indicate a possible modification.

Any subsequent load would perform a comparison of the Load Queue entries. If the same line is pending in the Load Queue and marked as modified, the ID field is checked. If the current line is older than that which was pending and modified, the pending loads in the Load Queue are canceled and re-executed after the subsequent load. This avoids the problem of having the older load get newer data than the newer load.

SUMMARY OF THE INVENTION

This invention provides a novel means to eliminate the load queue and only cancel the instruction that may have finished with the wrong data.

The invention provides a mis-queue table to hold entries for any rejected attempts to access the cache, or for instructions that cannot be completed for any other reason.

In the preferred embodiment, all instructions create an initial entry in the mis-queue table. If the data for the instruction is in the cache and available, the instruction entry is taken out of the mis-queue table. The processor does this after searching the mis-queue table for possible ordering problems. The instruction is then marked for completion by the sequencing unit of the processor, using the data found in the cache.

An ordering problem occurs when an older instruction is completed after a newer instruction, and the newer and older instructions access data at the same address.

If the data for the instruction is not available in the cache, or the instruction is otherwise unable to complete, the instruction entry created in the mis-queue table stays in the mis-queue table until the instruction is ready to complete. When the instruction is ready to complete, such as when data is ready in the cache, the instruction is then marked for completion. The instruction entry is then deleted from the mis-queue table.

When an instruction is ready to complete, the associated entry is deleted from the mis-queue table. A search is made of the mis-queue table for any previous instruction entries involving the same address. If any previous instruction entry is found with the same address as the completing entry, the previous instruction entries are marked as being out of order, since they have not completed yet. As noted before, this can happen, for example, when the instructions have aliased the same data address, and the older instructions have not been notified by the cache that the data is ready for use.

It should be noted that whether all instructions create entries in the mis-queue table, and instruction entries ready to complete are pulled out on the next cycle, or whether only rejected instructions create entries in the mis-queue table, the functional result is the same. The result is that instruction entries corresponding to instructions that complete immediately do not stay in the mis-queue table, and only instruction entries corresponding to instructions that do not complete immediately remain in it.

This eliminates the need to have a load queue since all present instructions are either deemed valid and set to be run or have a corresponding entry placed into a mis-queue table to wait for data.

If a present instruction is completed, the address of the data is checked against the entries in the mis-queue table for a match. If a match is present, the matching entries in the mis-queue table are marked as out of order. That is, if a newer instruction accessing data from an address has been set for completion earlier than an older instruction to the same address, then the older instruction is completed out of order, and its entry in the mis-queue table should be marked as such.

Further, when the cache returns valid data for an instruction that has had an entry in the mis-queue table for a while, the instruction corresponding to that entry is set for completion. A similar search is made in the mis-queue table for older instruction entries corresponding to instructions accessing data at that address that have not yet completed. If any matches are found, any matching older entries will be marked as out of order.

If a data coherency altering event happens, such as a snoop request, the mis-queue table is interrogated. Any entry with the same address as the data coherency altering event is marked as modified.

When an instruction entry is released from the mis-queue table, the instruction is completed. The processor then determines if certain events have occurred. If an instruction entry in the mis-queue table indicates that the instruction corresponding to that entry is both out of order and modified, all instructions that are supposed to execute after that out of order [illegible] instruction will be canceled and re-executed, thus preserving data coherency.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram of an interleaved cache, showing two instructions attempting to access the same data on one of the cache banks.

FIG. 2 is a diagram of a newer instruction accessing the data prior to an older instruction in the cache of FIG. 1, and how an out of order completion may occur.

FIG. 3 is a block diagram of a superscalar processor.

FIG. 4 is a block diagram of a load circuit in a superscalar processor.

FIG. 5 is a diagram of a mis-queue table according to the preferred embodiment of the invention.

FIG. 6 is a diagram of an entry in the mis-queue table showing the fields.

FIGS. [illegible] diagram how an embodiment of the present invention detects out of order instruction completion.

FIGS. 11a, 11b, 11c, and 11d diagram how an embodiment of the invention works.

DETAILED DESCRIPTION OF THE INVENTION

FIG. 3 is a block diagram of a processor system 10 for processing information in accordance with the present invention. In the preferred embodiment, processor 10 is a single integrated circuit superscalar microprocessor, such as the PowerPC™ processor from IBM Corporation, Austin, Tex. Accordingly, as discussed further hereinbelow, processor 10 includes various units, registers, buffers, memories, and other sections, all of which are formed by integrated circuitry. Also, in the preferred embodiment, processor 10 operates according to reduced instruction set computing ("RISC") techniques. As shown in FIG. 3, a system bus 11 is connected to a bus interface unit ("BIU") 12 of processor
10. BIU 12 controls the transfer of information between processor 10 and system bus 11.

BIU 12 is connected to an instruction cache 14 and to a data cache 16 of processor 10. Instruction cache 14 outputs instructions to a sequencer unit 18. In response to such instructions from instruction cache 14, sequencer unit 18 selectively outputs instructions to other execution circuitry of processor 10.

In addition to sequencer unit 18, which includes execution units of a dispatch unit 46 and a completion unit 48, in the preferred embodiment the execution circuitry of processor 10 includes multiple execution units, namely a branch unit 20, a fixed point unit A ("FXUA") 22, a fixed point unit B ("FXUB") 24, a complex fixed point unit ("CFXU") 26, a load/store unit ("LSU") 28 and a floating point unit ("FPU") 30. FXUA 22, FXUB 24, CFXU 26 and LSU 28 input their source operand information from general purpose architectural registers ("GPRs") 32 and fixed point rename buffers 34. Moreover, FXUA 22 and FXUB 24 input a "carry bit" from a carry bit ("CA") register 42. FXUA 22, FXUB 24, CFXU 26 and LSU 28 output results (destination operand information) of their operations for storage at selected entries in fixed point rename buffers 34. Also, CFXU 26 inputs and outputs source operand information and destination operand information to and from special purpose registers ("SPRs") 40.

FPU 30 inputs its source operand information from floating point architectural registers ("FPRs") 36 and floating point rename buffers 38. FPU 30 outputs results (destination operand information) of its operation for storage at selected entries in floating point rename buffers 38.

Sequencer unit 18 inputs and outputs information to and from GPRs 32 and FPRs 36. From sequencer unit 18, branch unit 20 inputs instructions and signals indicating a present state of processor 10. In response to such instructions and signals, branch unit 20 outputs (to sequencer unit 18) signals indicating suitable memory addresses storing a sequence of instructions for execution by processor 10. In response to such signals from branch unit 20, sequencer unit 18 inputs the indicated sequence of instructions from instruction cache 14. If one or more of the sequence of instructions is not stored in instruction cache 14, then instruction cache 14 inputs (through BIU 12 and system bus 11) such instructions from system memory 39 connected to system bus 11.

In response to the instructions input from instruction cache 14, sequencer unit 18 selectively dispatches through a dispatch unit 46 the instructions to selected ones of execution units 20, 22, 24, 26, 28 and 30. Each execution unit executes one or more instructions of a particular class of instructions. For example, FXUA 22 and FXUB 24 execute a first class of fixed point mathematical operations on source operands, such as addition, subtraction, ANDing, ORing and XORing. CFXU 26 executes a second class of fixed point operations on source operands, such as fixed point multiplication and division. FPU 30 executes floating point operations on source operands, such as floating point multiplication and division.

Processor 10 achieves high performance by processing multiple instructions simultaneously at various ones of execution units 20, 22, 24, 26, 28 and 30. Accordingly, each instruction is processed as a sequence of stages, each being executable in parallel with stages of other instructions. Such a technique is called "pipelining". In a significant aspect of the preferred embodiment, an instruction is normally processed at six stages, namely fetch, decode, dispatch, execute, completion and writeback.

In the preferred embodiment, each instruction requires one machine cycle to complete each of the stages of instruction processing. Nevertheless, some instructions (e.g. complex fixed point instructions executed by CFXU 26) may require more than one cycle. Accordingly, a variable delay may occur between a particular instruction's execution and completion stages in response to the variation in time required for completion of preceding instructions.

In response to a Load instruction, LSU 28 inputs information from data cache 16 and copies such information to selected ones of rename buffers 34 and 38. If such information is not stored in data cache 16, then data cache 16 inputs (through BIU 12 and system bus 11) such information from a system memory 39 connected to system bus 11. Moreover, data cache 16 is able to output (through BIU 12 and system bus 11) information from data cache 16 to system memory 39 connected to system bus 11.

Referring now to FIG. 4, there is shown a schematic diagram illustrating a circuit for processing instructions, such as a load, according to an embodiment of the invention. An address is passed to the data unit 204, which contains the control logic required to physically access the cache 206. Cache 206 has an output port connected to, in this case, a 64-bit data line which passes data from the cache 206 into the formatter 210 to be processed, if the data is in the cache.

In one embodiment of the invention, each time an instruction is dispatched, an entry in a mis-queue table 600 is created. If the instruction hits in the data cache, then on the following cycle the entry for that instruction is removed from the mis-queue table 600. However, if the instruction misses in the data cache, then its real address, and possibly its effective address, and other information remain in mis-queue table 600. The processor continually scans the address [illegible] the mis-queue table, and each cycle the processor attempts to access the cache at the effective address stored in mis-queue table 600. Eventually, the data becomes available in the cache for each of the entries in the mis-queue table 600 and is passed on to the formatter to be processed.

It should be noted that instead of accessing the cache through the effective address, as explained above, a microprocessor can attempt to access the cache via a real address stored in the cache. This is a matter of implementation, and does not affect the overall invention.

However, it should be noted that a present instruction need not initially be placed in mis-queue table 600 for the current invention to work. The present instruction need only be represented in the mis-queue table 600 when, for whatever reason, the present instruction cannot be set to complete immediately after it is initially introduced by the sequencing unit.

In the preferred embodiment, instruction entries are stored in order in mis-queue table 600. This is diagramed in FIG. 5, where the instruction generating instruction entry 410 is an older instruction than the instruction generating instruction entry 420; thus instruction entry 410 is stored higher in the mis-queue table 600 than instruction entry 420. However, it should be noted that with appropriate identification and ordering information, instructions could be stored in the mis-queue table in no particular order.

An entry in the mis-queue table is represented in FIG. 6. The minimal information necessary for the invention is an address information field 510, an out of order information field 520, and a modified data information field 530. The address information can also have subfields, including real address information 540, and effective address information
550. The mis-queue table entry may also have other information associated with it, such as a valid field 560, and an instruction id field 570, which could be used as ordering information.

In another embodiment, the entries in the mis-queue table are stored out of order in the mis-queue table. The valid field 560 in FIG. 6 indicates whether the mis-queue table entry is indeed still in the mis-queue table 600. A new entry is created in the first mis-queue table line not having the valid field 560 set. Ordering information is preserved with the use of instruction id field 570.

Turning now to FIG. 7, a new instruction 800 accessing data at address D is initially presented by the processor. In the preferred embodiment, an entry 610 is created for new instruction 800 in the next available slot in the mis-queue table 600. If new instruction 800 completes immediately, the processor checks the older entries 620, 630, and 640 in mis-queue table 600 against new instruction 800. Specifically, the address of the new instruction 800 is checked against the address information 510 of all older entries in mis-queue table 600 for a comparable address. By definition, all entries contained in mis-queue table 600 must correspond to older instructions than new instruction 800.

It should be noted that if new instruction 800 completes on the next cycle, entry 610 would be deleted. It should also be noted that entry 610 need not be created immediately. The operand address of new instruction 800 could be checked against the address information in the entries in mis-queue table 600, and only if new instruction 800 does not complete with the next clock cycle would entry 610 be created for it. In an alternative embodiment, entry 610 would not be deleted, but would have the valid field set to indicate it was no longer a valid mis-queue table entry. A newer entry could then use this line in mis-queue table 600.

Suppose the data for new instruction 800 is present in the cache. Then, new instruction 800 is set for completion with the data found for it in the cache. The address that new instruction 800 is supposed to use is then compared against address information 510 in all older entries in the mis-queue table 600. If the new instruction 800's address is comparable to the address information field 510 in any entry corresponding to an earlier instruction in mis-queue table 600, those matching entries are marked as out of order. Out of order means that a newer instruction has been marked for completion that uses the data at an address that an older instruction, still in the mis-queue table, is also to use. Out of order field 520 is used to mark an entry as out of order.

FIG. 8 shows the mis-queue table 600 after instruction 800 has completed on the next clock cycle. Note that instruction entry 620 has been marked as out of order using out of order field 520, since it has not completed and it accesses data at the same memory location as new instruction 800.

If new instruction 800 has to wait on the data from the cache, it enters the mis-queue table as entry 610 and will remain there until it can complete, as shown in FIG. 9. FIG. 9 also shows that several other instruction entries have been added to the mis-queue table 600 in the interim. At some later time, the data cache has the data ready for new entry 610, but not matching entry 620. Thus, new instruction 800 has completed.

When new instruction 800 is completed, the processor scans the entries in the mis-queue table which correspond to instructions older than instruction 800, that is, all entries above corresponding entry 610. The processor searches earlier entries for an address comparable to new instruction entry address field 650. If an older instruction entry is going to the same address as found in new instruction entry 610, the older instruction entry is marked as out of order, as shown in FIG. [illegible].

When new instruction 800 is completed, its entry in mis-queue table 600 is deleted. In the preferred embodiment all other entries move up, thus preserving the timing order of [illegible] instruction, the entry [illegible] change, occurs while an instruction entry is in mis-queue table 600, modified indicator 530 of the corresponding instruction entry will be set to indicate a potential problem. The processor scans the entries and marks as modified each entry whose address matches the address of the data change.

When data coherency has possibly been violated, the out of order indicator and the modified indicator will both show that these events have occurred. This means a newer instruction, which has been set for completion earlier than an older instruction, may have older data associated with it than the older instruction. This would probably violate data coherency.

FIGS. 11a, 11b, 11c, and 11d show a sequence in which this problem is detected. Mis-queue table 600 has initial entries 1110, 1120, and 1130, corresponding to instructions 1210, 1220, and 1230 respectively. New instruction 1240 is initiated going to memory location a, and its data is found in the cache. Entry 1110 is then marked as out of order when new instruction 1240 is completed before it in FIG. 11b. Snoop request 1140 is made indicating that memory location a has changed in FIG. 11c. Upon interrogating mis-queue table 600, entry 1110 is marked as modified. Thus, entry 1110 indicates a data coherency problem with the data coming from a.

It should be noted that new instruction 1240 initiated an entry 1150 in the mis-queue table 600 even when it cleared out immediately. If new instruction 1240 missed in the cache, initiated entry 1150 would remain on the mis-queue table 600. It should also be noted that entry 1150 could be created after the miss. The important thing is that new instruction 1240's operand address be compared to the entries in the mis-queue table, and any matches should be marked if new instruction 1240 completes before the instructions corresponding to older entries 1110, 1120, and 1130.

When an instruction entry is shown to be modified and out of order, as in the case of entry 1110, the newer instruction causing the older instruction entry to be marked as out of order cannot be run. In the preferred embodiment, when the instruction corresponding to entry 1110 is completed, the mis-queue table reports to completion logic that a problem has occurred with the instruction corresponding to this entry. The instruction corresponding to entry 1110 is completed and allowed to execute. However, all instructions following that initiating entry 1110, namely instructions 1220, 1230, and 1240, and any other instructions that completed and are set to execute after the instruction corresponding to entry 1110, are canceled and reset to go through the entire run process again.

In the preferred embodiment of the invention, the addresses are only compared to a set granularity. Thus, addresses are only compared to the double word boundary to indicate a potential problem. However, it should be noted that actual addresses could be compared, as well as other granularities.
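
The granularity rule can be sketched as a comparison that ignores the low-order address bits. The following Python fragment is illustrative only: the function name `comparable` is hypothetical, and the 8-byte double word size is an assumption (it matches the PowerPC convention, but the text does not state the size).

```python
def comparable(addr_a: int, addr_b: int, granule_bytes: int = 8) -> bool:
    """Return True if both addresses fall in the same aligned granule.

    granule_bytes must be a power of two; 8 models a double word
    (an assumed size), and 1 compares the actual addresses.
    """
    if granule_bytes <= 0 or granule_bytes & (granule_bytes - 1):
        raise ValueError("granule_bytes must be a power of two")
    mask = ~(granule_bytes - 1)   # clears the offset bits within a granule
    return (addr_a & mask) == (addr_b & mask)
```

Under these assumptions, identical addresses always land in the same granule, so a coarser granule can add matches between nearby addresses but does not drop a match between identical ones.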
In the preferred embodiment of the invention, after the
error situation is detected, all instructions after the instruc-
tion indicating the problem are flushed and reset for execu-
tion. Thus data coherency is preserved in the flush of the
remaining instructions.
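
The flush step can be illustrated with the reference numerals of FIGS. 11a through 11d. The Python sketch below is a simplified model, not the completion logic itself: `instructions_to_flush` is a hypothetical helper, and program order is represented simply by position in a list.

```python
from typing import List


def instructions_to_flush(program_order: List[int], faulting: int) -> List[int]:
    """Return every instruction that follows `faulting` in program order.

    `faulting` is the instruction whose mis-queue entry is marked both
    out of order and modified; it completes, and everything after it is
    canceled and re-executed.
    """
    position = program_order.index(faulting)  # raises ValueError if absent
    return program_order[position + 1:]


# Instructions 1210 to 1240 in program order; entry 1110 belongs to 1210.
flushed = instructions_to_flush([1210, 1220, 1230, 1240], faulting=1210)
```

Here `flushed` holds instructions 1220, 1230 and 1240, the same set the description names as canceled after the instruction corresponding to entry 1110.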

What is claimed: 

1. An apparatus for out of order execution of instructions, 
the apparatus comprising: 

(a) a mis-queue table for holding entries of instruction
data, each entry corresponding to an instruction in a 
computer microprocessor, the instruction data compris- 
ing: 

(i) address information for the instruction, the address 
information being held in an address information 
field of the respective entry; 

(ii) order information for the instruction, the order 
information being held in an order information field 
of the respective entry for indicating the order of the 
corresponding instruction in relation to other instruc-
tions;

(iii) out of order information, the out of order infor- 
mation being held in an out of order information field 
of the respective entry for indicating that a newer 
instruction using data at the address corresponding to 
the address information field has completed before
the current entry; 

(iv) data modification information for the instruction, 
for indicating a possibility of modified data at the 
address corresponding to the address information 
field;

(b) an out of order comparator for setting the out of order 
information field of an entry in the mis-queue table
upon the completion of a completing instruction, the 
out of order information field of a compared entry 
being set if the completing instruction comprises a
newer instruction which uses data at the address cor- 
responding to the address information field in the 
compared entry; 

(c) a modification comparator for comparing address 
information in the address information field in an entry 
in the mis-queue table to a possibly modified address, 
wherein the modification field in the entry is marked to 
indicate modified data at the address if the possibly 
modified address is comparable to the address infor-
mation in the instruction entry being compared.

2. The apparatus of claim 1, wherein the apparatus cancels 
all instructions following an instruction corresponding to an
instruction entry that indicates an out of order instruction 
and modified data. 

3. The apparatus of claim 1 wherein the out of order 
comparator compares all data address information in an 
entry to determine if the addresses are comparable. 

4. The apparatus of claim 1 wherein the out of order 
comparator compares a portion of the data address informa- 
tion in an entry to determine if the addresses are comparable. 

5. A method for detecting out of order instructions which 
may cause a data coherency violation in a microprocessor, 
the method comprising: 
(a) preparing a new instruction to execute on the
microprocessor, the new instruction having a data 
address; 

(b) upon completion of the new instruction

(i) comparing the data address of the instruction to 
existing entries in a mis-queue table, the instruction 
entries corresponding to previous instructions, the 
instruction entries containing address information, 
instruction order information, an out of order 
indicator, and a modified data indicator; 

(ii) if the data address of the new instruction is com- 
parable to address information of an entry in the 
mis-queue table, marking the comparable entry in 
the mis-queue table as an out of order instruction; 

(c) if the new instruction is presently unable to be 
completed, creating an entry for the new instruction in 
the mis-queue table, whereby the data address of the
new instruction is put in the address information of the
new instruction entry, and information on the order of
the instruction is put in the ordering information of the 
new instruction entry. 

6. The method of claim 5 further comprising the steps of: 

(d) continually scanning the mis-queue table for entries
corresponding to instructions set to execute; 

(e) when an instruction corresponding to an entry in the 
mis-queue table is set to execute:

(i) comparing the address information of the entry 
corresponding to the instruction set to execute with 
the address information of entries in the mis-queue
table corresponding to older instructions than the
instruction set to execute; 

(ii) if an entry is found in the mis-queue table corre-
sponding to an older instruction than the instruction
set to execute, and if the address information of the 
entry corresponding to the instruction set to execute 
is comparable to the address information in the entry 
corresponding to the older instruction, the entry 
corresponding to the older instruction is marked as 
out of order; 

(iii) removing the entry corresponding to the instruction 
set to execute from the mis-queue table. 

7. The method of claim 6 further comprising the steps of: 

(f) continually scanning for data altering events; 

(g) when a data altering event happens: 

(i) broadcasting the address of the altered data to the 
mis-queue table; 

(ii) comparing the address of the altered data to address 
information in the entries of the mis-queue table; 

(iii) if the address of the altered data and the address of 
an entry in the mis-queue table are comparable, the
entry in the mis-queue table having the comparable 
address is marked as modified. 
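
The steps recited in claims 5 through 7 can be illustrated by a small software model. This is a sketch under stated assumptions and not the claimed apparatus: the `MisQueue` class, its method names, the 8-byte comparison granule, and the convention that a smaller `order` means an older instruction are all hypothetical. The check that the completing instruction is newer than the matched entry follows the abstract and claim 1 rather than the literal wording of claim 5.

```python
# Hypothetical software model of the method steps (a)-(g).
from dataclasses import dataclass
from typing import List

GRANULE = 8  # assumption: addresses are compared at an 8-byte granularity


@dataclass
class Entry:
    address: int                # address information
    order: int                  # ordering information (assumption: smaller = older)
    out_of_order: bool = False  # out of order indicator
    modified: bool = False      # modified data indicator


class MisQueue:
    def __init__(self) -> None:
        self.entries: List[Entry] = []

    @staticmethod
    def _comparable(a: int, b: int) -> bool:
        return a // GRANULE == b // GRANULE

    def add(self, address: int, order: int) -> None:
        """Step (c): queue an instruction that cannot presently complete."""
        self.entries.append(Entry(address, order))

    def on_complete(self, address: int, order: int) -> None:
        """Step (b): mark older entries whose address is comparable."""
        for e in self.entries:
            if e.order < order and self._comparable(e.address, address):
                e.out_of_order = True

    def on_execute(self, order: int) -> Entry:
        """Step (e): a queued instruction is set to execute.

        Raises StopIteration if no entry with this order is queued.
        """
        entry = next(e for e in self.entries if e.order == order)
        self.entries.remove(entry)                    # step (e)(iii)
        self.on_complete(entry.address, entry.order)  # steps (e)(i)-(ii)
        return entry

    def on_data_altering_event(self, address: int) -> None:
        """Step (g): mark entries whose address matches the altered data."""
        for e in self.entries:
            if self._comparable(e.address, address):
                e.modified = True
```

In this model, an entry with both `out_of_order` and `modified` set corresponds to the condition under which the apparatus of claim 2 cancels all following instructions.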

***** 